DETAILED ACTION

Claim Rejections - 35 USC § 102 

1. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(b) the invention was patented or described in a printed publication in this or a foreign country or in public 
use or on sale in this country, more than one year prior to the date of application for patent in the United 
States. 

Claims 1-18 are rejected under 35 U.S.C. 102(b) as being anticipated by Goodman (US 
5,999,929 A). 

Goodman teaches: 

Claim 1 

A method comprising: 

receiving a first uniform resource locator (URL) including one or more 
parameters (Column 5 Lines 1-4, "the spider 14 uses URLs to identify Web pages 
to be retrieved for analysis"); 

retrieving content corresponding to the first URL (Column 5 Lines 5- "After the 
spider 14 receives a Web page for analysis, it caches the Web page locally within 
the link referral system"); 

retrieving content corresponding to a plurality of URLs having different 
parameter combinations of the one or more parameters; identifying a parameter 
combination from the plurality of URLs that corresponds to content that is approximately
the same as the content corresponding to the first URL; and generating one or more 
URL rewrite rules based on the identified parameter combination (Column 7-8 Lines 
24-53, "In generating the URL re-write rules, the Web page analyzer 15 generally 
processes the URL from the outward most portions of the respective World Wide 
Web addresses, eliminating portions of the respective series, as defined by the 
separators, to determine candidate URLS. For the illustrative URL above, 
"HTTP://www.netscape.com/index.html", candidate URLs will generally include,
for example, eliminating portions from the beginning of the World Wide Web 
address"). 

Claim 2 

The method of claim 1, where the different parameter combinations
include the first URL with no parameters, the first URL with each of the one or 
more parameters individually, and the first URL with combinations of the one or 
more parameters (Column 7 Lines 24-28, "In generating the URL re-write rules, the 
Web page analyzer 15 generally processes the URL from the outward most 
portions of the respective World Wide Web addresses, eliminating portions of the 
respective series, as defined by the separators, to determine candidate URLS", 
and also see Column 7 Lines 28-50). 
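
As a hedged illustration of the parameter combinations recited in claim 2 (not of Goodman's separator-based procedure), the following Python sketch enumerates the first URL with no parameters, with each parameter individually, and with every larger combination. The function name `parameter_combinations` and the `example.com` URL are hypothetical and do not appear in the claims or in the reference.

```python
from itertools import combinations
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def parameter_combinations(url):
    """Return URLs built from every subset of the query parameters of `url`.

    Subsets are produced smallest first, so the first entry is the URL with
    no parameters and the last entry is the URL with all of its parameters.
    """
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    urls = []
    for size in range(len(params) + 1):
        for subset in combinations(params, size):
            # Rebuild the URL with only the chosen parameters in the query.
            urls.append(urlunsplit(parts._replace(query=urlencode(subset))))
    return urls


# Hypothetical input URL, used only for illustration.
candidates = parameter_combinations("http://example.com/page?a=1&b=2")
```

For a URL with n parameters this produces 2^n candidates, so a practical system would presumably sample or cap the enumeration.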

Claim 3 

The method of claim 1, further comprising:

performing the receiving a first URL, retrieving content corresponding to
the first URL, retrieving content corresponding to the plurality of URLs, and identifying 
the parameter combination, for multiple different first URLs that each include the one or 
more parameters; and 

generating the one or more URL rewrite rules for the identified parameter 
combinations for each of the first URLs (See Claim 1 rejection). 

Claim 4 

The method of claim 3, where the rewrite rules specify that 
parameters that do not occur in a threshold number of the identified parameter 
combinations are to be removed (Column 8 Lines 30-33, "After generating the score, 
the Web page analyzer 15 will store the candidate re-write rule in the URL re-write 
rulebase 16B if the score is below a predetermined threshold value"). 
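
The threshold limitation of claim 4 can be pictured with the short Python sketch below. It is an illustration of the claim language only, under the assumption that each identified parameter combination is available as a set of parameter names; it does not reproduce the scoring procedure that Goodman describes. The names `parameters_to_remove`, `combos`, and the sample parameter names are hypothetical.

```python
from collections import Counter


def parameters_to_remove(identified_combinations, all_parameters, threshold):
    """Return the parameters that occur in fewer than `threshold` of the
    identified parameter combinations, in sorted order."""
    counts = Counter(
        name for combo in identified_combinations for name in set(combo)
    )
    # Counter returns 0 for parameters that never occur in any combination.
    return sorted(p for p in all_parameters if counts[p] < threshold)


# Hypothetical identified combinations for four different first URLs.
combos = [{"id"}, {"id", "page"}, {"id"}, {"id", "sessionid"}]
removed = parameters_to_remove(
    combos, {"id", "page", "sessionid", "ref"}, threshold=2
)
```

Here `id` occurs in all four combinations and is kept, while the remaining parameters fall below the assumed threshold of 2.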

Claim 5 

The method of claim 1, wherein each rewrite rule applies to a particular
web site or web host (Column 5 Lines 17-21, "To assist in the duplicate Web page 
consolidation operation, the Web page analyzer 15 develops the URL re-write 
rulebase 16B, which contains rules which are used by the Web page analyzer 15 
to convert URLs to respective canonical forms"). 

Claim 6

The method of claim 1, where the identified parameter combination includes a
minimum number of parameters (Column 7 Lines 40-50, examples show removing 
portions from the "beginning" and "end" of the World Wide Web address without 
ever actually removing the first unique part of the URL). 
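
To make the "minimum number of parameters" limitation concrete, the following Python sketch searches parameter subsets from smallest to largest and returns the first one whose content is approximately the same as the content for the full parameter set. It is a sketch under stated assumptions: `fetch` is a hypothetical callable standing in for content retrieval, `fake_fetch` is a toy stand-in for a web server, and the 0.9 similarity cutoff is an arbitrary choice that does not come from the claims or from Goodman.

```python
from difflib import SequenceMatcher
from itertools import combinations


def minimal_combination(params, fetch, min_ratio=0.9):
    """Return the smallest subset of `params` (a dict) whose fetched content
    is approximately the same as the content for the full parameter set."""
    reference = fetch(params)
    names = sorted(params)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            candidate = {name: params[name] for name in subset}
            similarity = SequenceMatcher(None, reference, fetch(candidate)).ratio()
            if similarity >= min_ratio:
                return candidate
    return dict(params)


def fake_fetch(params):
    # Toy server: only the "id" parameter affects the returned content.
    if params.get("id") == "42":
        return "Full text of article 42 about crawling."
    return "404 not found"


result = minimal_combination(
    {"id": "42", "sessionid": "abc", "ref": "home"}, fake_fetch
)
```

In this toy setup the search stops at the single parameter `id`, because dropping `sessionid` and `ref` leaves the content unchanged.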

Claim 7 

A method for converting a uniform resource locator (URL) into a 
canonical form of the URL, the method comprising: 

receiving a URL that refers to content and that includes a parameter set 
including at least one parameter (Column 5 Lines 1-4, "the spider 14 uses URLs to 
identify Web pages to be retrieved for analysis"); 

determining a rewrite rule by receiving a plurality of URLs that include the 
parameter set and identifying parameters in the parameter set that do not contribute to 
content (Column 7-8 Lines 24-53, "In generating the URL re-write rules, the Web 
page analyzer 15 generally processes the URL from the outward most portions of 
the respective World Wide Web addresses, eliminating portions of the respective 
series, as defined by the separators, to determine candidate URLS. For the 
illustrative URL above, "HTTP://www.netscape.com/index.html", candidate URLs
will generally include, for example, eliminating portions from the beginning of the 
World Wide Web address"); 

applying the rewrite rule to the URL by removing the parameters that do not 
contribute to content from the URL; and outputting the rewritten URL as the canonical 
form of the URL (Column 5 Lines 17-21, "To assist in the duplicate Web page
consolidation operation, the Web page analyzer 15 develops the URL re-write 
rulebase 16B, which contains rules which are used by the Web page analyzer 15 
to convert URLs to respective canonical forms"). 
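
The applying and outputting steps of claim 7 can be sketched in Python as follows. This is a minimal sketch that assumes the rewrite rule has already been reduced to a set of parameter names that do not contribute to content; the function name `canonicalize`, the parameter name `sessionid`, and the `example.com` URL are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonicalize(url, removable):
    """Remove the query parameters named in `removable` from `url` and
    return the rewritten URL as its canonical form."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in removable
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Hypothetical rule: "sessionid" does not contribute to content.
canonical = canonicalize(
    "http://example.com/item?id=42&sessionid=abc", {"sessionid"}
)
```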

Claim 9 

The method of claim 7, where the identifying parameters in the 
parameter set that do not contribute to content includes: retrieving content
corresponding to a sampled URL including a combination of parameters in the 
parameter set; and identifying the combination of parameters as corresponding to 
retrieved content, where the retrieved content is approximately the same as another
retrieved content corresponding to another combination of parameters that includes a 
reduced number of parameters (Column 8 Lines 1-9, "If the Web page analyzer 15 
determines in step 2b that the URLs in the entry are not identical to each other, it 
(that is, the Web page analyzer 15) find the shortest substitution rule that 
textually rewrites the longer URL into the shorter URL. For example, the shortest 
rule to change "http://www.netscape.com/index.html" to
"HTTP://netscape.com/index.html" is to replace "www." with "" (that is, delete
"www."). This rule is now a "candidate" rewrite rule"). 
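
The substitution rule in the quoted passage can be illustrated with the Python sketch below. It is one simple way to derive such a rule, assuming a single contiguous replacement found by trimming the common prefix and the common suffix of the two URLs; Goodman does not state that this is the procedure used, and the function name `shortest_substitution` is hypothetical. Both URLs are written in lowercase here so that the comparison is purely textual.

```python
import os


def shortest_substitution(longer, shorter):
    """Return (old, new) such that replacing `old` with `new` once in
    `longer` yields `shorter`, using a single contiguous substitution."""
    # os.path.commonprefix compares strings character by character.
    prefix = len(os.path.commonprefix([longer, shorter]))
    rest_long, rest_short = longer[prefix:], shorter[prefix:]
    # The common suffix is the common prefix of the reversed remainders.
    suffix = len(os.path.commonprefix([rest_long[::-1], rest_short[::-1]]))
    old = rest_long[: len(rest_long) - suffix]
    new = rest_short[: len(rest_short) - suffix]
    return old, new


rule = shortest_substitution(
    "http://www.netscape.com/index.html", "http://netscape.com/index.html"
)
rewritten = "http://www.netscape.com/index.html".replace(rule[0], rule[1], 1)
```

For the two URLs from the quoted passage, the derived rule is to replace "www." with the empty string, which matches the candidate rewrite rule the passage describes.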

Claim 10 

The method of claim 9, where the combination of parameters 
includes at least one of the sampled URL with no parameters, the sampled URL with 
individual parameters, or the sampled URL with combinations of the at least one
parameter (Column 7 Lines 24-28, "In generating the URL re-write rules, the Web 
page analyzer 15 generally processes the URL from the outward most portions of 
the respective World Wide Web addresses, eliminating portions of the respective 
series, as defined by the separators, to determine candidate URLS", and also see 
Column 7 Lines 28-50). 

Claim 11 

The method of claim 7, where the rewrite rule applies to a particular 
web site or web host (Column 5 Lines 17-21, "To assist in the duplicate Web page 
consolidation operation, the Web page analyzer 15 develops the URL re-write 
rulebase 16B, which contains rules which are used by the Web page analyzer 15 
to convert URLs to respective canonical forms"). 

Claim 12 

One or more devices comprising: 

at least one fetch bot configured to download content on a network from 
locations specified by uniform resource locators (URLs) (Column 4 Lines 60-65, 
"spider"); 

a content manager configured to extract URLs from the downloaded 
content (Column 5 Lines 5-10, "Web page analyzer");

a rewrite component configured to receive a URL that refers to content and that
includes a parameter set including at least one parameter, apply a predetermined 
rewrite rule to the URL that removes the at least one parameter from the URL when the 
at least one parameter does not affect the content referred to by the URL, where the 
predetermined rewrite rule is determined by receiving a plurality of URLs that include 
the parameter set and identifying parameters in the parameter set that do not contribute 
to content; and output the rewritten URL as the canonical form of the URL (Column 5 
Lines 17-21, "To assist in the duplicate Web page consolidation operation, the 
Web page analyzer 15 develops the URL re-write rulebase 16B, which contains 
rules which are used by the Web page analyzer 15 to convert URLs to respective 
canonical forms"); and a URL manager configured to store the canonical form of the 
URL (Column 5 Lines 30-33, "The Web page analyzer 15 stores information 
regarding the identifications for the various classes and the Web page 
assignment information in the link class database 17"). 

Claim 14 

The one or more devices of claim 12, where the identifying
parameters in the parameter set that do not contribute to content includes: retrieving
content corresponding to a sampled URL including a combination of parameters 
in the parameter set; and identifying the combination of parameters as corresponding to 
retrieved content, where the retrieved content is approximately the same as another
retrieved content corresponding to another combination of parameters that includes a 
reduced number of parameters (Column 8 Lines 1-9, "If the Web page analyzer 15 



Application/Control Number: 10/748,655 Page 9 

Art Unit: 2146 

determines in step 2b that the URLs in the entry are not identical to each other, it 
(that is, the Web page analyzer 15) find the shortest substitution rule that 
textually rewrites the longer URL into the shorter URL. For example, the shortest 
rule to change "http://www.netscape.com/index.html" to
"HTTP://netscape.com/index.html" is to replace "www." with "" (that is, delete 
"www."). This rule is now a "candidate" rewrite rule"). 

Claim 15 

The one or more devices of claim 14, where the combination of 
parameters includes at least one of the sampled URL with no parameters, the sampled 
URL with individual parameters, or the sampled URL with combinations of the at least 
one parameter (Column 7 Lines 24-28, "In generating the URL re-write rules, the 
Web page analyzer 15 generally processes the URL from the outward most 
portions of the respective World Wide Web addresses, eliminating portions of the 
respective series, as defined by the separators, to determine candidate URLS", 
and also see Column 7 Lines 28-50). 

Claim 16 

The one or more devices of claim 12, where each rewrite rule 
applies to a particular web site or web host (Column 5 Lines 17-21, "To assist in the 
duplicate Web page consolidation operation, the Web page analyzer 15 develops 
the URL re-write rulebase 16B, which contains rules which are used by the Web 
page analyzer 15 to convert URLs to respective canonical forms").

Claim 17

A system comprising: 

means for receiving a first uniform resource locator (URL) including one or 
more parameters (Column 5 Lines 1-4, "the spider 14 uses URLs to identify Web 
pages to be retrieved for analysis"); 

means for retrieving content corresponding to the first URL (Column 5 Lines 5- 
"After the spider 14 receives a Web page for analysis, it caches the Web page 
locally within the link referral system"); 

means for retrieving content corresponding to a plurality of URLs having 
different parameter combinations of the one or more parameters; means for identifying 
the parameter combination from the plurality of URLs that corresponds to content that is 
approximately the same as the content corresponding to the first URL and that contains 
a minimum number of parameters generating one or more URL rewrite rules based on 
the identified parameter combination (Column 7-8 Lines 24-53, "In generating the 
URL re-write rules, the Web page analyzer 15 generally processes the URL from 
the outward most portions of the respective World Wide Web addresses, 
eliminating portions of the respective series, as defined by the separators, to 
determine candidate URLS. For the illustrative URL above, 
"HTTP://www.netscape.com/index.html", candidate URLs will generally include,
for example, eliminating portions from the beginning of the World Wide Web 
address"); and

means for generating one or more URL rewrite rules based on the
identified parameter combination (Column 5 Lines 17-21, "To assist in the duplicate 
Web page consolidation operation, the Web page analyzer 15 develops the URL 
re-write rulebase 16B, which contains rules which are used by the Web page 
analyzer 15 to convert URLs to respective canonical forms"). 

Claim 18 

A computer-readable memory device including programming instructions 
executed by a processor, the programming instructions comprising: 

instructions for receiving a first uniform resource locator (URL) including 
one or more parameters (Column 5 Lines 1-4, "the spider 14 uses URLs to identify 
Web pages to be retrieved for analysis"); 

instructions for retrieving content corresponding to the first URL (Column 5 
Lines 5- "After the spider 14 receives a Web page for analysis, it caches the Web 
page locally within the link referral system"); 

instructions for retrieving content corresponding to a plurality of URLs 
having different parameter combinations of the one or more parameters; instructions for 
identifying the parameter combination from the plurality of URLs that corresponds to 
content that is approximately the same as the content corresponding to the first URL 
and that includes a minimum number of parameters (Column 7-8 Lines 24-53, "In 
generating the URL re-write rules, the Web page analyzer 15 generally processes 
the URL from the outward most portions of the respective World Wide Web 
addresses, eliminating portions of the respective series, as defined by the
separators, to determine candidate URLS. For the illustrative URL above,
"HTTP://www.netscape.com/index.html", candidate URLs will generally include,
for example, eliminating portions from the beginning of the World Wide Web
address"); and

instructions for generating one or more URL rewrite rules based on the 
identified parameter combination (Column 5 Lines 17-21, "To assist in the duplicate 
Web page consolidation operation, the Web page analyzer 15 develops the URL 
re-write rulebase 16B, which contains rules which are used by the Web page 
analyzer 15 to convert URLs to respective canonical forms"). 

Claim 19 

The system of claim 17, where the parameter combination comprises one of the 
first URL with no parameters, the first URL with each of the one or more parameters 
individually, or the first URL with combinations of the one or more parameters (Column 
7 Lines 24-28, "In generating the URL re-write rules, the Web page analyzer 15 
generally processes the URL from the outward most portions of the respective 
World Wide Web addresses, eliminating portions of the respective series, as 
defined by the separators, to determine candidate URLS", and also see Column 7 
Lines 28-50). 

Claim 20

The computer-readable memory device of claim 18, where the instructions for
receiving a first URL, the instructions for retrieving content corresponding to the first 
URL, the instructions for retrieving content corresponding to a plurality of URLs, and the 
instructions for identifying the parameter combination are performed for multiple first 
URLs, each first URL including the one or more parameters (See claim 18 rejection), 
and where the one or more URL rewrite rules specify that parameters that do not occur 
in a threshold number of the identified parameter combinations are to be removed 
(Column 8 Lines 30-33, "After generating the score, the Web page analyzer 15 will 
store the candidate re-write rule in the URL re-write rulebase 16B if the score is 
below a predetermined threshold value"). 



Conclusion 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to FARHAD ALI, whose telephone number is (571) 270-1920.
The examiner can normally be reached Monday through Friday, 7:30am to
5:00pm.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Jeffrey C. Pwu, can be reached at (571) 272-6798. The fax phone number
for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 